feat(lifecycle): LCM Apply* pipeline (split A)#5
Conversation
Implements ports.LifecycleManager as a synchronous observe->decide->persist reducer. Every entrypoint runs the shared pipeline under a per-session lock: load canonical -> run the matching pure decider -> diff into a sparse merge-patch -> persist. Never polls, never writes the display status. - ApplyRuntimeObservation -> probe decider; always writes the runtime axis. - ApplySCMObservation -> open-PR / terminal-PR deciders (failed fetch is a no-op: failed probe != "no PR"). Open PRs write only the PR axis. - ApplyActivitySignal -> updates the activity axis + maps onto the session axis; only valid-confidence signals are authoritative. - OnSpawnCompleted -> runtime alive + handles to metadata; session stays not_started (display: spawning). - OnKillRequested -> SM's explicit terminal-write authority. - TickEscalations -> no-op stub (reaction/escalation engine is split B). Composition rule (#1): liveness owns the runtime + death axis; activity owns the working/idle/waiting axis. A healthy probe verdict writes the session axis only to recover a liveness-owned state, so it never clobbers an activity-owned needs_input/blocked. Activity is the mirror: it stays off the death axis. Detecting clear (#3): a non-detecting probe verdict clears stale detecting memory so the next probe reads no phantom Prior. Built/tested against in-memory fakes (LifecycleStore with full merge-patch + ExpectedRevision, recording Notifier/AgentMessenger). Per-session serialisation verified under -race. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Implements the initial Lifecycle Manager (LCM) Apply* pipeline as a synchronous reducer: per-session serialized load canonical → run pure decider → diff into sparse merge-patch → persist, with table tests + an in-memory fake LifecycleStore to exercise merge-patch semantics.
Changes:
- Added
internal/lifecycle.Managerimplementingports.LifecycleManagerentrypoints for runtime probes, SCM observations, activity signals, and SM-reported outcomes. - Added a decide-bridge layer translating
portsfacts intointernal/domain/decideinputs, plus composition rules for which lane may write the session axis. - Added comprehensive table tests and a merge-patch-capable in-memory fake store (including
Detectingthree-way semantics andExpectedRevisionsupport).
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
backend/internal/lifecycle/manager.go |
New LCM implementation: per-session locking + shared mutate pipeline + Apply*/On* entrypoints. |
backend/internal/lifecycle/decide_bridge.go |
Bridges ports facts to pure deciders; defines composition predicates + kill/runtime mappings. |
backend/internal/lifecycle/manager_test.go |
Table tests covering runtime/activity/SCM entrypoints, spawn/kill outcomes, and serialization behavior. |
backend/internal/lifecycle/fakes_test.go |
In-memory LifecycleStore fake applying merge-patch semantics for tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- shouldWriteSessionRuntime: never resurrect a terminal session; an observation may refresh the runtime axis but must touch neither the session axis nor the detecting memory (gated in ApplyRuntimeObservation). - OnSpawnCompleted: error on an unseeded session instead of fabricating a partial record (SM must seed first — a missing seed is a contract violation). - OnKillRequested: no-op on an unknown/already-gone session (benign race) instead of fabricating a terminal record. - keyedMutex: reference-count entries and evict on last release so the lock map stays bounded in a long-running daemon. - runtimeSubstateFromFacts: map RuntimeProbeIndeterminate to RuntimeUnknown with a neutral reason, distinct from the probe_error of a failed probe. Adds tests for terminal non-resurrection, unseeded spawn-completed error, and unknown-session kill no-op. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
harshitsinghbhandari
left a comment
There was a problem hiding this comment.
Review — LCM Apply* pipeline (split A), at 3945b10
Verified independently (throwaway worktree): gofmt clean, build/vet pass, go test -race green at 86.5% coverage. The five Copilot findings are all correctly addressed. This is a faithful, well-tested realization of the pipeline design — including the two contract follow-ups carried from the decide-core review.
Strong points (verified, not just read)
- #1 composition rule — the axis split is right: liveness owns the runtime + death/detecting axis, activity owns working/idle/waiting, and
isLivenessOwnedgates when a healthy probe may recover a state. The "healthy probe must not clobber an activity-ownedneeds_input" and "recovers a liveness-owneddetecting→ working" cases are both tested. - #3 detecting clearing — handled on every non-detecting path (probe, kill, terminal guard) via
setDetecting/ClearDetecting, and tested. - PR-via-derive — an open PR writes only the PR sub-state and lets
DeriveLegacyStatussurface it; this avoids two observers fighting over the session axis. Nicely done. keyedMutexrefcount/eviction is correct under concurrency (a waiter holds a ref before blocking, so an entry can't be evicted out from under it).TestPerSessionSerializationcovers it.
Follow-ups / decisions (none blocking)
- Activity can't resolve a
detectingsession (confirm intent).shouldWriteSessionActivityreturns false while liveness-owned, so even a high-confidencewaiting_input/blockedhook can't pull a session out ofdetecting— only the probe pipeline can. It does refresh the activity sub-state (which keepsdetectingalive rather than concluding death). Defensible as a flap-guard, but awaiting_inputhook is strong evidence the agent is alive — worth a deliberate decision on whether activity should be allowed to resolve a liveness ambiguity. - Merged/closed overrides the session axis unconditionally (modulo terminal), unlike the open-PR path which defers to activity. So a merge overrides a
needs_input/blockedactivity state. Probably intended (merge is a milestone), but worth a one-line comment on why merge supersedes activity. activityToSessionmaps Ready/Idle → reasonresearch_complete— a specific reason used as the generic idle reason; it'll read misleadingly in diagnostics. Consider a neutral idle reason.ExpectedRevisionunused (forward note). In-process per-session mutex is correct for a single daemon; if the store ever gets multiple writers (CDC/multi-process), the optimistic-CAS the contract provides would be needed. Fine for split A.- Terminal session still patches the runtime axis on every probe — harmless for display but generates revision/CDC churn; relies on the reaper not probing terminal sessions.
- Coverage 86.5% — load-bearing logic well covered; consider topping up the non-manual kill kinds (cleanup/error) and any uncovered SCM branches.
Recommendation: Approve, with #1 and #2 as quick design confirms and the rest as minor polish. Solid work.
Address Harshit's PR #5 review (approve w/ design confirms + polish): - #1 (design decision): a valid activity signal is proof of life, so it now resolves a detecting session — writes the activity-mapped session state and clears the quarantine memory. Scoped to detecting only; a liveness-escalated stuck stays the probe pipeline's to resolve. Terminal still never reopens. - #2: document why a merged/closed PR parks the session axis even over an activity-owned needs_input/blocked (a merge is a milestone), unlike the open-PR path that defers to activity. - #3: map plain idle activity to a neutral session reason instead of the misleading research_complete (kept for ready, which implies completion). - #6: cover all three kill kinds (manual/cleanup/error), the open-PR review branches (changes_requested/mergeable/review_pending), and the neutral idle reason. Coverage 86.5% -> 88.6%. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Thanks for the thorough review. Addressed in
All gates green ( |
…change Defer findings #4 (isGitRepo case-sensitivity) and #5 (GitChecker seam) out of this PR. Restores the original exec-based isGitRepo and the New(store) constructor; removes git.go, git_test.go, and the test-only export shims. The error-standardization and other findings are unaffected. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ayers (#95) (#96) * refactor(backend): LLD maintainability fixes in controllers/service layers Addresses the high + medium severity findings from the LLD review of backend/internal/httpd and backend/internal/service (#95): 1. Controllers no longer import internal/session_manager. Session sentinel errors are now *domain.ServiceError values carrying their own HTTP mapping, so the controller translates them with one generic errors.As — no cross-package sentinel imports. 2. One error pattern across services: project.Error is now an alias of the shared domain.ServiceError, and session_manager sentinels use it too. A single writeServiceError replaces the per-resource error switches. 3. Clean-orchestrator business logic moved out of the controller into session.Service.SpawnOrchestrator(ctx, projectID, clean). 4. isGitRepo no longer treats case-different paths as equal on case-sensitive filesystems; case-insensitive compare is gated to darwin/windows via samePath. 5. Project repo check sits behind an injectable GitChecker, so the service is testable without a real git binary. 6. httpd exports only the production constructors (NewWithDeps, NewRouterWithControl); the 3 test-only wrappers are removed and the "router with empty deps" convenience moved to an unexported test helper. Closes #95 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(backend): standardize service errors on internal/httpd/errors Replace the domain.ServiceError approach with a REST-API-scoped error package and a single envelope renderer, per review feedback: - Add internal/httpd/errors (package errors, aliased apierr): one structured Error type with semantic Kinds (Internal/Invalid/NotFound/Conflict) and constructors. Imports nothing, so any layer can depend on it. - envelope.WriteError is now the single path from a service error to the wire APIError, and the only place a Kind becomes an HTTP status/word. The per-resource writeProjectError/writeSessionError translators are gone. - Delete domain/errors.go (keeps domain pure of HTTP-flavored kinds) and service/project/errors.go (no per-service error files); services build errors inline via apierr constructors. - session_manager sentinels are apierr.Error values (pointer identity still works with errors.Is). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * revert(backend): drop GitChecker seam and isGitRepo case-sensitivity change Defer findings #4 (isGitRepo case-sensitivity) and #5 (GitChecker seam) out of this PR. Restores the original exec-based isGitRepo and the New(store) constructor; removes git.go, git_test.go, and the test-only export shims. The error-standardization and other findings are unaffected. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(session): translate engine errors to API errors at the facade The session_manager is the internal command engine and must not depend on the REST API error vocabulary. Revert its sentinels to plain errors.New values and move the engine→API translation into the service/session facade (toAPIError), which is the correct boundary. Controllers still see apierr.Error and never import the engine; the engine no longer imports internal/httpd/errors. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * docs(session): tighten error comments to state what the code does Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * style(envelope): make KindInternal an explicit case in httpStatus Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(apierr): rename package, test SpawnOrchestrator, parity fixes Address review feedback on PR #96: - Rename internal/httpd/errors → internal/httpd/apierr (package apierr) so importers no longer alias around the stdlib errors package. - Add a commander seam to session.Service and unit-test the relocated clean-orchestrator rule: clean=true kills all active orchestrators before spawning; clean=false spawns without kills. - project.Add: wrap the UpsertProject store error in apierr.Internal for parity with its sibling paths (was a raw 500). - Document that KindInternal is iota's zero value, so a zero-value Error defaults to 500. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
…viewer to worker harness Per PR #197 review feedback: - Reviewer agent posts its review to the PR itself, so remove the ports.PRReviewPoster port, the GitHub review poster, the submit HTTP route + DTO, and the service Submit method (#1, #4, #7). - Trigger spawns the reviewer agent over the worker's worktree with its own review prompt, mirroring the session launch flow (resolve agent by harness -> argv -> runtime.Create) (#8, #9). - Default reviewer harness reuses the worker's harness when supported, falling back to claude-code; reviewer config stays independent of the worker override (#5, #6). - Drop the `ao review` CLI for this PR's scope (#2, #3). Regenerated OpenAPI + TS types. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(review): configurable AO code review backend (V1)
Add per-project configurable code review of a worker's PR. A reviewer
agent runs one-shot over the worker's own worktree and posts its result
to the PR; the worker picks the feedback up through the existing SCM
observer review-nudge path.
- domain: ProjectConfig.reviewers (+ default reviewer harness), Review /
ReviewRun types and verdict/status vocab.
- storage: review + review_run tables (0011), sqlc queries, store methods.
- service/review: rewrite the in-memory stub as a persisted ReviewService
(Trigger/Submit/List) with a reviewer Runner over agent resolver +
runtime; ports.PRReviewPoster implemented on the GitHub adapter.
- http: session-scoped routes POST /sessions/{id}/reviews/trigger,
POST .../submit, GET .../reviews; regenerated OpenAPI + TS types.
- cli: ao review trigger|submit|list.
- frontend: adapt ReviewDashboard to the per-worker reviews API.
Closes #192
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): address review — drop submit/poster/CLI, default reviewer to worker harness
Per PR #197 review feedback:
- Reviewer agent posts its review to the PR itself, so remove the
ports.PRReviewPoster port, the GitHub review poster, the submit HTTP
route + DTO, and the service Submit method (#1, #4, #7).
- Trigger spawns the reviewer agent over the worker's worktree with its
own review prompt, mirroring the session launch flow (resolve agent by
harness -> argv -> runtime.Create) (#8, #9).
- Default reviewer harness reuses the worker's harness when supported,
falling back to claude-code; reviewer config stays independent of the
worker override (#5, #6).
- Drop the `ao review` CLI for this PR's scope (#2, #3).
Regenerated OpenAPI + TS types.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(review): restore ao review submit (records verdict+body in AO)
Per maintainer request, bring back `ao review submit`. AO records the
reviewer's verdict and body on the review_run and marks the pass complete;
it does not post to GitHub — the reviewer agent posts its review to the PR
itself.
- storage: add review_run.body (0011), persist via Insert/UpdateReviewRunResult.
- service: restore Submit (no SCM poster) storing verdict + body.
- http: restore POST /sessions/{id}/reviews/submit + SubmitReviewInput.
- cli: ao review submit [worker] --verdict --body (worker from arg/--session/$AO_REVIEW_WORKER).
- runner: reviewer prompt instructs posting to GitHub and recording via ao review submit.
Regenerated OpenAPI + TS types.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): move reviewer runner to its own package; sharpen prompt
Per PR #197 review:
- Move the concrete reviewer runner out of the service layer into a new
internal/review_runner package (package reviewrunner), beside other
orchestration packages like session_manager. The service keeps only the
Runner interface + RunSpec it depends on; the agent-resolver + runtime
launch flow lives in review_runner.
- Sharpen the reviewer prompt: tell the agent to diff against the PR base,
focus on high-confidence findings, post via `gh pr review`, and record
the result with `ao review submit`; review-only (no commits/edits).
- Add unit tests for the runner.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): simplify review_run schema; provider-agnostic reviewer prompt
Per PR #197 review:
- review_run: status default 'running' (drop 'pending'), drop CHECK
constraints on status/verdict, drop the updated_at column and the
session/iteration index. Propagated through queries, domain, store,
service, and tests.
- Reviewer prompt no longer hardcodes GitHub/gh commands — it instructs the
agent to use whatever review tooling the provider offers, keeping the
flow extensible across SCM providers.
Regenerated sqlc + OpenAPI/TS.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): launch reviewer before persisting the run
Trigger now spawns the reviewer agent first and then writes the review_run
with a status derived from the launch outcome (running on success, failed
if it never started), instead of inserting a running row and correcting it
to failed afterwards.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): pluggable reviewer registry distinct from worker harnesses
Reviewers are now their own pluggable adapter set, separate from the worker
agent registry — adding a reviewer (claude-code today, greptile tomorrow) is
a one-line registration that does not widen the worker harness vocabulary,
and a worker harness does not automatically become a valid reviewer.
- domain.ReviewerHarness: a distinct vocabulary (AllReviewerHarnesses) with
its own IsKnown; ReviewerConfig/Review/ReviewRun use it. ResolveReviewerHarness
reuses the worker harness only when it is itself a supported reviewer, else
falls back to claude-code.
- ports.Reviewer: a reviewer-specific contract (ReviewCommand → argv + env)
that models one-shot / non-prompt CLIs natively instead of forcing every
reviewer through the worker's interactive GetLaunchCommand(Prompt:...).
- internal/adapters/reviewer: a separate registry + resolver (mirrors the
worker agent registry) with the claude-code reviewer adapter, which owns the
review prompt and reuses the worker claude-code launch construction.
- review_runner resolves via the reviewer registry (not the worker
AgentResolver) and merges AO_REVIEW_WORKER into the adapter's env.
- daemon wires the reviewer resolver. Registry/domain parity is test-enforced.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(review): cover run-scoped reviewer submit
* fix(api): update generated review submit schema
* refactor(review): split core engine (internal/review) from API service
Move the review orchestration (Trigger/Submit/List, run-id generation,
deps, RunSpec/Runner, sentinels) into a transport-independent core package
internal/review (Engine). internal/service/review is now a thin API-flow
boundary: the controller-facing Manager interface + a Service that delegates
to the engine + error re-exports.
This keeps the service layer to API concerns and lets the same engine back a
future in-process CLI trigger without going through HTTP. review_runner now
depends on the core package; daemon builds the engine and wraps it in the
service. No API/schema changes.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(review): commit-aware trigger, reviewer handle for UI, no env vars
Reworks the review trigger lifecycle and drops env-based coupling:
- review_run gains target_sha (the reviewed commit) and drops iteration.
A repeat trigger for the same PR head short-circuits to the existing run.
- review gains reviewer_handle_id: the live reviewer pane's runtime handle,
reused across passes and exposed in the reviews API so the UI can attach
its terminal over /mux.
- Trigger flow: if a live reviewer pane exists and a new commit arrived,
message it to re-review; otherwise spawn a fresh reviewer. The run is
recorded only after the reviewer is launched.
- No environment variables: the reviewer adapter embeds the explicit
`ao review submit --session <w> --run <id>` command in the spawn prompt
and the re-review message. CLI submit requires --run/--session (no env
fallbacks).
- Merge review_runner into internal/review as a Launcher (spawn/notify/alive).
- Trigger returns 201 for a new pass, 200 when reusing an existing run.
Regenerated sqlc + OpenAPI/TS.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): author the reviewer prompt centrally, not in the adapter
Mirror the worker model (session_manager builds the prompt; adapters just
place it via LaunchConfig.Prompt). The reviewer prompt now lives in
internal/review/prompt.go and is passed through ports.ReviewInvocation.Prompt;
the claude-code reviewer adapter just feeds inv.Prompt to its launch command
and returns it as the re-review message. One-shot CLI reviewers may ignore it.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): split reviewer prompt into system+task, mirroring buildSpawnTexts
Mirror session_manager.buildSpawnTexts for the reviewer: a standing role goes
in the system prompt, the per-pass task (PR/commit + exact `ao review submit`
command) goes in the user prompt. internal/review/prompt.go now returns
(prompt, systemPrompt); both flow through ports.ReviewInvocation and the
claude-code adapter places them via LaunchConfig{Prompt, SystemPrompt}. The
re-review message reuses the per-pass prompt (role already established in the
running pane).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Vaibhaav <user@example.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Summary
Implements
ports.LifecycleManageras the synchronous observe → decide → persist reducer (split A — the Apply* pipeline only). Every entrypoint runs one shared pipeline under a per-session lock: load canonical → run the matching pure decider (internal/domain/decide) → diff into a sparse merge-patch → persist. The LCM never polls and never writes the display status (derived on read).Base is the lane integration branch
feat/lcm-sm-contracts, notmain.Entrypoints
ApplyRuntimeObservation→ResolveProbeDecision. Always writes the runtime axis.ApplySCMObservation→ open-PR / terminal-PR deciders. A failed fetch (Fetched=false) is a no-op (failed probe ≠ "no PR"). Open PRs write only the PR axis; merged/closed also park the session.ApplyActivitySignal→ updates the activity axis + maps onto the session axis; onlyvalid-confidence signals are authoritative.OnSpawnCompleted→ runtime→alive + handles to metadata; session staysnot_started(displayspawning).OnKillRequested→ the SM's explicit terminal-write authority.TickEscalations→ documented no-op (reaction table + escalation engine are split B).PR #4 review follow-ups
workingverdict writes the session axis only to recover a liveness-owned state (detecting, or a death-inferred reason), so it never clobbers an activity-ownedneeds_input/blocked. The activity path is the mirror — it stays off the death axis.LifecyclePatch.ClearDetectingso staleStartedAt/EvidenceHashcan't leak into the next probe'sPrior(scoped to the probe pipeline).Serialization
Per-session mutex map (lazy
map[SessionID]*sync.Mutexbehind an outer lock): serial within a session, parallel across sessions, fully synchronous.ExpectedRevisionis implemented in the fake store but unused by the LCM (the mutex is the serializer).Scoping calls worth a look
none/indeterminate/ unfetched facts are clean no-ops (no write, no revision bump).Out of scope (later PRs)
Reaction table + escalation engine + real
TickEscalations; the Session Manager; real adapters.Notifier/AgentMessengerare held on the struct for the ACT lane but not invoked yet.Test plan
gofmt -l .(empty)go build ./...go vet ./...go test -race ./...Apply*and assert persisted canonical + derived display + detecting state (composition rule, detecting clear, terminal kill, SCM PR/terminal, spawn metadata).-race(50 concurrent signals, no lost update).LifecycleStoreexercises full merge-patch incl. three-wayDetecting/ClearDetectingandExpectedRevision.🤖 Generated with Claude Code